ABSTRACT



Training in tests and measurement, grade level taught, and subject area taught
were all found to have significant effects on teachers' use of different types of
tests and test items. A random sample of 555 practicing teachers in the State of
Wyoming participated (81% response rate). Results suggest that flexibility in
testing is enhanced by training in tests and measurement beyond the typical basic
course. Results also provide information which may be used in tailoring tests and
measurement courses and are discussed in those terms.
Teacher use of tests



Testing in U.S. schools continues to be practiced extensively, though debate
continues on its place and its value. Given the widespread use of tests, with its
potential to help or to hinder, it is essential that the assessments made and the
methods or instruments used to make them be both of high quality and appropriate
to the situation and purpose. Students' motivation to achieve and their
perceptions of the educational system may be damaged by inadequate testing
practices at any level. Recently several authors have argued that college
training in tests and measurement may not be adequately oriented to what teachers
actually need, thus limiting their facility in using appropriate testing
techniques (Ebel, 1967; Fennessey, 1982; Gullickson, 1984b; Newman & Stallings,
1982). Gullickson (1984b) calls for the development of strategies to meet
teachers' needs but notes that prerequisite to this is simply a description of
teachers' testing behavior.

Studies of testing practice in the U.S. have consistently found extensive test
use. Carlberg (1981) reported that 15% of class time was devoted to testing. In
a survey conducted by Newman and Stallings (1982), teachers reported spending
more than 10% of their time dealing with tests. Gullickson (1982) found that 95%
of the teachers he surveyed tested at least biweekly. The estimated average
percentage of students' course grades based on test scores is 40-50%,
with a range of 0-100% (Gullickson, 1984b; McKee & Manning-Curtis, 1982; Newman &
Stallings, 1982). Tests, thus, are used frequently. But how are they used, and
how does their use vary with tests and measurement training, content area, and
level taught?

Several studies have been conducted relating flexibility in testing practice
to training, grade level, and subject taught. The number of purposes for which
tests are used and the number of item types used were found to relate to knowledge
of measurement principles (Newman & Stallings, 1982). Those teachers with higher
knowledge scores tended to use tests for more purposes and to use more item
types. However, Fennessey (1982) found no relationship between training and the
number or types of tests used.

Grade level taught has been found to be related to test use and to attitudes 
toward testing. Fewer tests were found to be given at lower than at higher grade 
levels (Gullickson, 1982; Yeh et al., 1981) and attitudes toward testing were less
positive at the lower grade levels (Tollefson et al., 1985). 

Use of item types and evaluation techniques has also been found to vary
across grade levels and subject area taught (Chambers, 1982; Gullickson, 1984a).
Newman and Stallings (1982), for example, found teachers to use completion items,
multiple-choice, matching, true-false, short answer, and essay questions (from
most to least frequently). Gullickson (1982) found teachers to use objective item
types most, followed by essay items. Use of textbook/teachers' manuals as item
sources decreased as grade level increased.

The purpose of this paper was to examine classroom test and item use by level
of training, grade taught, and subject area taught. Hypotheses were:

1. There are significant differences in the number of item and test types
used among teachers with 0, 1, 2, and 3+ courses in tests and measurement,
with flexibility increasing as amount of training increases.




2. There are significant differences in the number of item and test types
used among teachers at the elementary, junior high, and senior high
levels.

3. There are significant differences in the number of item and test types
used among teachers in different subject areas (art/music, English,
physical education, social science, mathematics, and science).

METHODS 

Instruments 

A survey form was developed containing questions about training in tests and
measurement, subject areas and grades taught, the sources from which test items
are taken, hours spent in testing-related activities, the percent of students'
grades based on test scores, use of six types of test items, and use of five
types of tests.

Types of items used were treated both as an aggregate (α = .70) and as six
separate variables. Questions asked for frequency of use (1=never, 6=always) of:

- true-false questions 

- essay questions 

- multiple-choice questions 

- short answer questions 

- completion questions 

- matching questions 
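
The aggregate item-type score can be sketched as follows. This is a minimal illustration that assumes the aggregate is a simple sum of the six 1-6 frequency ratings; the survey description does not state the scoring rule, and the field names and example ratings below are hypothetical.

```python
# Hypothetical field names for the six item types named in the survey.
ITEM_TYPES = (
    "true_false",
    "essay",
    "multiple_choice",
    "short_answer",
    "completion",
    "matching",
)


def aggregate_item_use(ratings):
    """Sum one teacher's six frequency ratings (1 = never ... 6 = always).

    Assumption: the aggregate is an unweighted sum, so it ranges from 6 to 36.
    """
    missing = [name for name in ITEM_TYPES if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for name in ITEM_TYPES:
        if not 1 <= ratings[name] <= 6:
            raise ValueError(f"{name} rating must be between 1 and 6")
    return sum(ratings[name] for name in ITEM_TYPES)


# Invented ratings for a single respondent, for illustration only.
example_ratings = {
    "true_false": 2,
    "essay": 4,
    "multiple_choice": 3,
    "short_answer": 4,
    "completion": 3,
    "matching": 4,
}
example_total = aggregate_item_use(example_ratings)
```

Under this assumption a higher total indicates more frequent use across more item types, which is how the aggregate serves as a rough index of flexibility.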

Types of tests used were also treated as an aggregate (α = .66) and as five
separate variables. Questions assessed the frequency of use of:

- diagnostic tests

- norm-referenced tests 

- criterion-referenced tests

- performance tests 

- competency tests 

Subjects

Our goal was to survey approximately 500 teachers, a sample size adequate to
allow analyses by grade level. The size of the sample was based on expectations
of a 70% return rate. A systematic random sample was chosen from the State
Department of Education list of all licensed educators. During the spring
semester, these teachers were sent a letter explaining the nature of the study, a
survey form, and a stamped return envelope. A return rate of 55% was obtained from
the first mailing. With two follow-ups, a total of 555 replies were received, or
81% of the deliverable envelopes. (Twelve were undeliverable, 4 refused, and 133
did not reply.)
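
The sampling plan can be illustrated with a short sketch. The target of 500 replies and the expected 70% return rate come from the text; the licensed-educator list, its size, and the function names are hypothetical stand-ins, and the fixed-interval selection shown is one common form of systematic random sampling rather than the study's documented procedure.

```python
import math
import random


def mailout_size(target_replies, expected_return_rate):
    """Number of surveys to mail so the expected replies reach the target."""
    if not 0 < expected_return_rate <= 1:
        raise ValueError("expected_return_rate must be in (0, 1]")
    return math.ceil(target_replies / expected_return_rate)


def systematic_sample(population, n, rng=None):
    """Pick n members at a fixed interval from a random starting point."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    rng = rng or random.Random()
    step = len(population) // n   # sampling interval
    start = rng.randrange(step)   # random start within the first interval
    return [population[start + i * step] for i in range(n)]


# Figures from the text: about 500 replies wanted, 70% return expected.
n_to_mail = mailout_size(500, 0.70)

# Hypothetical stand-in for the state list of licensed educators.
educators = [f"educator_{i}" for i in range(9000)]
sample = systematic_sample(educators, n_to_mail, random.Random(0))
```

With integer intervals, the last few names on the list can never be selected when the list size is not a multiple of the sample size; a production version would need to handle that, but the sketch keeps the interval logic visible.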

The sample included a greater percentage of females (64%), primarily as a
consequence of the over-representation of females among elementary school
teachers. The greatest percentage of teachers in the total sample and at each of
the three grade levels was in the 30-39 year-old range. The average number of
years of teaching experience was 12. All teachers in the sample held bachelors'
degrees, with 23% holding masters'. Subject area responsibilities seemed
representative of public school teachers: the majority of elementary teachers
were responsible for all areas; at the junior and senior high levels the most
frequently reported areas were in core subjects (English, math, science, social
studies, physical education, art/music). Training in tests and measurement was
consistent across grade levels taught: 27% had no coursework, 47% had one
course, 17% had two courses, and 9% had three or more courses.



Analyses 

Mean frequencies of test and item use were calculated by amount of training, by
grade level, and by content area. The significance of differences in usage was
assessed using multivariate analysis of variance followed by univariate tests.
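
The univariate follow-up step can be sketched as a one-way analysis of variance. The text does not give the software or the exact test specification, so this is a generic textbook F computation, not the authors' procedure; the multivariate step is not shown, and the ratings below are invented for illustration.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per group
    (for example, one list of item-use ratings per grade level).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")

    grand_mean = fmean(x for g in groups for x in g)
    group_means = [fmean(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented ratings for three hypothetical groups (not data from the study).
elementary = [1, 2, 3]
junior_high = [2, 3, 4]
senior_high = [5, 6, 7]
f_stat, df_b, df_w = one_way_anova_f([elementary, junior_high, senior_high])
```

The resulting F would then be compared against the F distribution with the returned degrees of freedom to obtain a significance level, one dependent variable at a time.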



RESULTS 



Significant multivariate effects of coursework in tests and measurement were
found for types of tests used (F(15, 1287) = 5.22, p < .01) but not for use of item
types, providing partial support for hypothesis 1. Persons with more coursework
reported more frequent use of all types of tests, with major increases occurring
between groups with 0-1 and 2-3+ courses.

Hypothesis 2 was supported: There were significant multivariate differences
across grade levels taught in both use of different item types (F(12, 972) = 8.85,
p < .01) and use of different types of tests (F(10, 882) = [illegible], p < .01).
Use of true-false and essay items increased significantly from kindergarten
through the 6th grade and continued to increase through high school. Use of
multiple-choice, short answer, completion, and matching items increased through
grade 4 and then dropped slightly in grades 5 and 6. With the exception of short
answer items, differences in use of these item types at upper grade levels were
not significant. Use of short answer items increased significantly between the
lower and the upper grade levels. Use of diagnostic tests was highest in grades
1-4. Use of other types of tests did not differ significantly across the
elementary grade levels. Use of both norm- and criterion-referenced tests
decreased significantly across the upper grade levels while use of performance
and competency tests did not. Elementary grade teachers reported a significantly
heavier reliance on teachers' manuals as a source of test items than did teachers
at upper grade levels and spent significantly less time per week in
testing-related activities (4.7 hours per week vs. 7.4 at junior high and 6.5 at
senior high).

Table 1 presents the same information broken down by content area.
Elementary teachers were excluded from this analysis. Univariate F-statistics
and significance levels are noted for each variable in Table 1. Hypothesis 3 was
supported: Significant multivariate differences were found for both use of item
types (F [degrees of freedom and value illegible], p < .01) and use of test types
(F(25, 692) = 5.00, p < .01).

(Table 1 about here)

DISCUSSION 

The purpose of this study was primarily to describe differences across grade
levels and subject areas in use of different types of tests and test items. The
results of significance tests indicate that there are clear differences in
testing techniques used by teachers at different grade levels in different
subjects. This result is consistent with those found by Chambers (1982) and
Gullickson (1984a). The fact that significant differences exist in testing
practices across grade level and content area is not surprising: different
testing techniques lend themselves more readily to the assessment of different
skills. This study serves to describe and highlight the differences.

At the elementary levels, diagnostic tests are used frequently, with tests being
developed with the aid of teachers' manuals. Standardized tests are used
extensively at this level as well. Techniques for early diagnosis and
remediation are essential knowledge for elementary level teachers. Completion
and matching items are used more frequently at this level than other item types.
At the middle school level, performance tests and competency tests are used more
frequently, as are short answer items. However, the entire range of item types,
both subjective and objective, comes into play. At the high school level,
objective item types as well as essay and short answer are all used. Diagnostic
(and standardized) tests are used less frequently. Given the limited amount of
time devoted to tests and measurement in college curricula, it is appropriate to
emphasize different types of tests and items in course sections offered for
prospective elementary, middle school, and high school teachers. Alternatively,
tests and measurement instructors may need to demonstrate by concrete example how
all test and item types can be useful at all levels. (This information is not
provided in detail in the major textbooks.)

Use of test and item types at the high school level varies with area taught.
English teachers reported more extensive use of subjective than of objective item
types; short answer items were used most frequently by mathematics teachers. In
most areas, tests were given to assess performance or achievement more frequently
than to diagnose difficulties. Fennessey (1982) argues that tests and
measurement training should be focused on the student's curricular area (English,
physical education, mathematics) whenever possible as well as being structured to
respond to needs of prospective elementary, middle school, and high school
teachers. Such structuring of training would involve both great flexibility on
the part of instructors and skill on the part of persons who schedule students.

The use of tests to determine students' grades varied widely across subject
taught, so even though emphasis on test construction seems appropriate for
classes of prospective high school teachers, for students in fields such as art
and music, instruction in alternative assessment techniques is necessary.
Gullickson (198?) points out that nontest evaluative techniques (lab reports,
papers) are used at all grade levels. In areas such as art, music, dance,
physical education, and English, where less emphasis is placed on test results
than in science and math, it is even more important to instruct prospective
teachers in evaluation techniques other than paper-and-pencil tests.

Differences in test use were found between teachers with two or more tests
and measurement courses and teachers with no coursework or one course. This
suggests that coursework beyond the typical undergraduate introductory course
will be needed to effect behavioral change. As noted earlier, testing in U.S.
schools is extensive, with tests being used frequently by both those with and
without formal training. Optimal use of tests requires advanced training. This
training may perhaps be provided after the teacher has had a year or more of
experience rather than as an undergraduate. The majority of baccalaureate
programs in teacher education require completion of one course in tests and
measurement. This single course does not seem to have the impact on practice
that one might wish for. It is suggested that this course (and accompanying
texts) be restructured to provide more information directly pertinent to
classroom teachers' needs, perhaps by accompanying a basic volume with a second
volume composed largely of concrete examples, or that additional training be
provided to allow teachers practice in optimal testing techniques.
Table 1. Use of types of items and tests by grade level taught
(means and standard deviations)

                                             Content Area

Item Types             Art/Music   English        PE    Social      Math   Science       F     p

Number of cases               29        57        26        37        42        27

True-false                   2.2       2.6       3.4       3.0       2.1       2.7    6.78   .01
                             (?)     ( .9)     (1.5)     (1.1)     ( .9)     (1.0)

Multiple-choice              3.2       3.4       3.3       4.2       2.7       3.7    7.67   .01
                           (1.2)     (1.1)     (1.4)     ( .9)     (1.1)     ( .9)

Completion                   3.3       3.2       3.3       3.7       3.0       3.9    2.40   .04
                           (1.?)     (1.2)     (1.6)     (1.1)     (1.2)     (1.2)

Matching                     3.3       3.3       3.0       3.9       2.6       3.7    ?.16   .01
                           (1.5)     (1.0)     (1.5)     (1.0)     ( .7)     (1.0)

Essay                        2.5       4.0       2.7       3.8       1.8       3.6    18.?   .01
                           (?.6)     (1.3)     (1.4)     (1.3)     (1.0)     (1.3)

Short answer                 3.4       4.0       3.5       4.4       3.4       4.4    5.46   .01
                           (1.7)     (1.0)     (1.5)     ( .9)     (1.?)     ( .8)

AGGREGATE                   18.0      20.4      19.0      23.0      15.7      22.0   14.90   .01
                           (5.3)     (3.5)     (7.0)     (3.4)     (3.5)     (3.1)

Test Types

Diagnostic                   1.8       3.0       1.6       2.8       3.1       2.3    8.30   .01
                           (1.0)     (1.3)     ( .9)     (1.3)     (1.3)     (1.1)

Norm-referenced              1.4       1.9       1.7       1.9       2.0       2.0       -    NS

Criterion-referenced
  frequency                  2.1       2.2       2.3       2.3       2.6       2.4       -    NS
                           (1.4)     (1.3)     (1.6)     (1.4)     (1.6)     (1.4)
  % using                  48.3%     35.2%     44.0%     48.6%     45.2%     51.9%

Performance                  3.9       3.1       4.5       2.4       3.7       3.0    9.62   .01
                           (?.2)     (1.3)     (1.0)     (1.2)     (1.6)     (1.2)

Competency                   3.3       2.4       2.7       2.4       3.1       2.5    2.39   .04
                           (1.5)     (1.2)     (1.4)     (1.3)     (1.6)     (1.3)

AGGREGATE                   12.1      12.0      12.4      11.0      13.8      12.3       -    NS
                           (3.6)     (4.3)     (3.0)     (4.7)     (4.7)     (4.6)

Sources of test items:

  Construct own            69.5%     63.3%     68.8%     63.7%     61.9%     61.8%       -    NS
                          (28.3)    (24.8)    (25.8)    (26.2)    (33.1)    (28.0)

  Use manuals              22.7%     34.7%     29.5%     31.4%     39.9%     35.4%       -    NS
                          (25.4)    (23.0)    (20.4)    (24.4)    (31.8)    (22.2)

Time (hours) spent in test-related activities per week
                             5.2       8.5       5.5       6.5       7.1       7.4    2.59   .03
                           (4.9)     (5.5)     (4.5)     (3.4)     (3.3)     (3.5)

Percent of grade based on tests
                               ?         ?         ?      ?.2%      ?.5%     50.1%    7.74   .01

Note. ? marks characters illegible in the source copy. For the last row, five
standard deviations are legible (19.1, 17.7, 14.2, 21.4, 18.9); their column
positions are uncertain.



